ACL 2013 Predicting and Improving Text Readability for Target Reader Populations

نویسندگان

  • Sandra Williams
  • Advaith Siddharthan
  • Ani Nenkova
  • Siobhan Devlin
چکیده

In this paper, we introduce a syntax-based sentence simplifier that models simplification using a probabilistic synchronous tree substitution grammar (STSG). To improve the STSG model specificity we utilize a multi-level backoff model with additional syntactic annotations that allow for better discrimination over previous STSG formulations. We compare our approach to T3 (Cohn and Lapata, 2009), a recent STSG implementation, as well as two state-of-the-art phrase-based sentence simplifiers on a corpus of aligned sentences from English and Simple English Wikipedia. Our new approach performs significantly better than T3, similarly to human simplifications for both simplicity and fluency, and better than the phrasebased simplifiers for most of the evaluation metrics.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Simple, readable sub-sentences

We present experiments using a new unsupervised approach to automatic text simplification, which builds on sampling and ranking via a loss function informed by readability research. The main idea is that a loss function can distinguish good simplification candidates among randomly sampled sub-sentences of the input sentence. Our approach is rated as equally grammatical and beginner reader appro...

متن کامل

Influence of Target Reader Background and Text Features on Text Readability in Bangla: A Computational Approach

In this paper, we have studied the effect of two important factors influencing text readability in Bangla: the target reader and text properties. Accordingly, at first we have built a novel Bangla readability dataset of 135 documents annotated by 50 readers from two different backgrounds. We have identified 20 different features that can affect the readability of Bangla texts; the features were...

متن کامل

Readability Indices for Automatic Evaluation of Text Simplification Systems: A Feasibility Study for Spanish

This paper addresses the problem of automatic evaluation of text simplification systems for Spanish. We test whether already-existing readability formulae would be suitable for this task. We adapt three existing readability indices (two measuring lexical complexity and one measuring syntactic complexity) to be computed automatically, which are then applied to a corpus of original news texts and...

متن کامل

On Improving the Accuracy of Readability Classification using Insights from Second Language Acquisition

We investigate the problem of readability assessment using a range of lexical and syntactic features and study their impact on predicting the grade level of texts. As empirical basis, we combined two web-based text sources, Weekly Reader and BBC Bitesize, targeting different age groups, to cover a broad range of school grades. On the conceptual side, we explore the use of lexical and syntactic ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2013